Apply the builder pattern to OpenAiImageModel and add support for the new gpt-image-1 model #2943

Open
wants to merge 1 commit into
base: main
from

Conversation

dev-jonghoonpark
Contributor

@dev-jonghoonpark dev-jonghoonpark commented Apr 30, 2025

Summary of Changes

  • Apply the builder pattern to OpenAiImageModel.
  • Add a new OpenAI image generation model: gpt-image-1.
    • new options added: background, moderation, output_compression, output_format
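The builder style and the four new options listed above can be illustrated with a minimal, self-contained sketch. The class `ImageOptionsSketch` and its method names are hypothetical and chosen only for illustration; this is not the actual Spring AI API, and the real classes in this PR may be shaped differently.

```java
// Hypothetical sketch of builder-style image options; NOT the Spring AI API.
final class ImageOptionsSketch {

    private final String model;
    private final String background;
    private final String moderation;
    private final Integer outputCompression;
    private final String outputFormat;

    private ImageOptionsSketch(Builder b) {
        this.model = b.model;
        this.background = b.background;
        this.moderation = b.moderation;
        this.outputCompression = b.outputCompression;
        this.outputFormat = b.outputFormat;
    }

    static Builder builder() {
        return new Builder();
    }

    String model() { return model; }
    String background() { return background; }
    String moderation() { return moderation; }
    Integer outputCompression() { return outputCompression; }
    String outputFormat() { return outputFormat; }

    // Each setter returns the builder so calls can be chained.
    static final class Builder {
        private String model;
        private String background;
        private String moderation;
        private Integer outputCompression;
        private String outputFormat;

        Builder model(String v) { this.model = v; return this; }
        Builder background(String v) { this.background = v; return this; }
        Builder moderation(String v) { this.moderation = v; return this; }
        Builder outputCompression(Integer v) { this.outputCompression = v; return this; }
        Builder outputFormat(String v) { this.outputFormat = v; return this; }

        // Unset options stay null, so a caller can omit them from a request.
        ImageOptionsSketch build() {
            return new ImageOptionsSketch(this);
        }
    }
}
```

A caller would chain only the options it needs, for example `ImageOptionsSketch.builder().model("gpt-image-1").outputFormat("png").build()`; the option values here are illustrative.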

Related documentation about the new model:

Points to be aware of during testing (Differences from DALL-E 3):

  1. The new model does not support returning results via a URL.
  2. The revised_prompt field is not included in the response.
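Because of these two differences, code that consumes a result cannot assume a URL is present. The following is a minimal sketch of one way to handle both cases, assuming the image arrives either as a URL or as a base64-encoded string. The record `ImageResultSketch` and its accessors are hypothetical and are not part of Spring AI or the OpenAI client.

```java
import java.util.Base64;
import java.util.Optional;

// Hypothetical holder for one generated image; NOT a Spring AI type.
// Either field may be null depending on which model produced the result.
record ImageResultSketch(String url, String b64Json) {

    // True when the model returned a downloadable URL.
    boolean hasUrl() {
        return url != null;
    }

    // Decodes the base64 payload when present; empty otherwise.
    Optional<byte[]> decodedBytes() {
        if (b64Json == null) {
            return Optional.empty();
        }
        return Optional.of(Base64.getDecoder().decode(b64Json));
    }
}
```

A test for a model that returns no URL would then assert that `hasUrl()` is false and that `decodedBytes()` is present, while a test for a URL-returning model would assert the opposite.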

@dev-jonghoonpark dev-jonghoonpark marked this pull request as draft April 30, 2025 00:34
@dev-jonghoonpark

This comment was marked as resolved.

@dev-jonghoonpark dev-jonghoonpark force-pushed the add-new-open-ai-image-model branch from bcb15fe to e56583e Compare April 30, 2025 05:20
@dev-jonghoonpark dev-jonghoonpark changed the title Add new OpenAI image generation model gpt-image-1 Refactor OpenAI Image API and add support for the new gpt-image-1 model Apr 30, 2025
@dev-jonghoonpark dev-jonghoonpark force-pushed the add-new-open-ai-image-model branch 2 times, most recently from 870bb6d to f4071fa Compare April 30, 2025 05:37
@dev-jonghoonpark dev-jonghoonpark marked this pull request as ready for review April 30, 2025 05:42
@dev-jonghoonpark dev-jonghoonpark force-pushed the add-new-open-ai-image-model branch from f4071fa to 5e13f09 Compare May 1, 2025 20:27
@dev-jonghoonpark dev-jonghoonpark changed the title Refactor OpenAI Image API and add support for the new gpt-image-1 model Apply the builder pattern to OpenAiImageModel and add support for the new gpt-image-1 model May 1, 2025
@markpollack
Member

Thanks for this!

@markpollack markpollack self-assigned this May 6, 2025
@markpollack markpollack added this to the 1.0.0-RC1 milestone May 6, 2025
@markpollack
Member

Not yet sure how to reconcile this with #2943. Also, 9ca3a7d touches the same areas.

@dev-jonghoonpark
Contributor Author

dev-jonghoonpark commented May 6, 2025

What potential issues are anticipated in this regard?

Would you like additional test cases for the new options to be added, referring to OpenAiImageOptionsTests.java from commit 9ca3a7d?

@markpollack
Member

markpollack commented May 11, 2025

@dev-jonghoonpark I didn't have time to look; I only noticed that it touched the same topic/area of the code. @sobychacko was looking into it, but we ran into some key permission issues. Will get back to you soon.

@markpollack
Member

markpollack commented May 13, 2025

I just verified with OpenAI, but it is taking a long time to get access. In the meantime I changed the model in the tests to DALL_E_3 and then got failures, as background, moderation, outputCompression, outputFormat, and quality are not known to that model.
Also, after removing all those options, I got a test error related to the URL, as you mentioned.

org.opentest4j.AssertionFailedError: 
expected: null
 but was: "https://oaidalleapiprodscus.blob.core.windows.net/private/org-mBGrU5or5nigb7PEh9VM8dqw/user-bmw1AxctHuQVRTHxTZLhW5iP/img-f28VODVsA8ZrucIO3D3M0JRu.png?st=2025-05-13T12%3A26%3A54Z&se=2025-05-13T14%3A26%3A54Z&sp=r&sv=2024-08-04&sr=b&rscd=inline&rsct=image/png&skoid=8b33a531-2df9-46a3-bc02-d4b1430a422c&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2025-05-13T03%3A46%3A03Z&ske=2025-05-14T03%3A46%3A03Z&sks=b&skv=2024-08-04&sig=8/NAC83sAJLj/5lGV51AM90JPM%2BU65n2eaSfiY6X/zk%3D"
Expected :null
Actual   :"https://oaidalleapiprodscus.blob.core.windows.net/private/org-mBGrU5or5nigb7PEh9VM8dqw/user-bmw1AxctHuQVRTHxTZLhW5iP/img-f28VODVsA8ZrucIO3D3M0JRu.png?st=2025-05-13T12%3A26%3A54Z&se=2025-05-13T14%3A26%3A54Z&sp=r&sv=2024-08-04&sr=b&rscd=inline&rsct=im ...


	at org.springframework.ai.openai.image.OpenAiImageModelIT.gptImageModelTest(OpenAiImageModelIT.java:110)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)

This is going to lead to confusion, so I think we need to take a step back and look at the design and how to separate out different options for different image models. Possibly just dedicated options classes per model.
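The "dedicated options classes per model" idea could look roughly like the sketch below. All type names (`ImageModelOptionsSketch`, `DallE3OptionsSketch`, `GptImage1OptionsSketch`) are hypothetical, and the choice of fields on each record is an assumption for illustration, not a proposal for the actual Spring AI design. The point is only that each model's type exposes the options that model accepts, so an unsupported option cannot be set at all.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical common contract; NOT a Spring AI type.
sealed interface ImageModelOptionsSketch permits DallE3OptionsSketch, GptImage1OptionsSketch {

    String model();

    Map<String, Object> toRequestParams();

    // Adds a parameter only when a value was provided.
    static void putIfPresent(Map<String, Object> params, String key, Object value) {
        if (value != null) {
            params.put(key, value);
        }
    }
}

// Assumed DALL-E 3 options: has a response format (e.g. URL), none of the new options.
record DallE3OptionsSketch(String size, String responseFormat) implements ImageModelOptionsSketch {

    public String model() { return "dall-e-3"; }

    public Map<String, Object> toRequestParams() {
        Map<String, Object> p = new LinkedHashMap<>();
        p.put("model", model());
        ImageModelOptionsSketch.putIfPresent(p, "size", size);
        ImageModelOptionsSketch.putIfPresent(p, "response_format", responseFormat);
        return p;
    }
}

// Assumed gpt-image-1 options: the four new options, no response format.
record GptImage1OptionsSketch(String size, String background, String moderation,
        Integer outputCompression, String outputFormat) implements ImageModelOptionsSketch {

    public String model() { return "gpt-image-1"; }

    public Map<String, Object> toRequestParams() {
        Map<String, Object> p = new LinkedHashMap<>();
        p.put("model", model());
        ImageModelOptionsSketch.putIfPresent(p, "size", size);
        ImageModelOptionsSketch.putIfPresent(p, "background", background);
        ImageModelOptionsSketch.putIfPresent(p, "moderation", moderation);
        ImageModelOptionsSketch.putIfPresent(p, "output_compression", outputCompression);
        ImageModelOptionsSketch.putIfPresent(p, "output_format", outputFormat);
        return p;
    }
}
```

With this shape, swapping the model in a test means swapping the options type, and the compiler flags options that the other model does not know, rather than the failure surfacing at request time.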

I had hoped to get this into RC1, but we will have to circle back for a 1.0.1 or other release. Thanks for the PR though.

@markpollack markpollack modified the milestones: 1.0.0-RC1, 1.0.x May 13, 2025
@dev-jonghoonpark
Contributor Author

dev-jonghoonpark commented May 13, 2025

@markpollack

As mentioned in the PR description (see "Points to be aware of during testing (Differences from DALL-E 3)"),
the new model does not support returning results via a URL.

Therefore, I set up a separate test method, gptImageModelTest, specifically for the GPT_IMAGE_1 model.
The DALL_E_3 model cannot be used in this test method because it returns results via a URL.

3 participants